CLAIMS:

1. A method for summarizing at least one multimedia stream (101, 102), the method
comprising:

a.) one of receiving and retrieving said at least one multimedia stream
(101, 102) comprising video, audio and text information;

b.) dividing the at least one multimedia stream (101, 102) into a video
sub-stream (303), an audio sub-stream (305) and a text sub-stream (307);

c.) identifying video, audio and text key elements from said video (303),
audio (305) and text (307) sub-streams, respectively;

d.) computing an importance value for the identified video, audio and text
key elements identified at said step (c);

e.) first filtering the identified video, audio and text key elements to
exclude those key elements whose associated importance value is less than a pre-defined
video, audio and text importance threshold, respectively;

f.) second filtering the remaining key elements from said step (e) in
accordance with a user profile;

g.) third filtering the remaining key elements from said step (f) in
accordance with network and user device constraints; and

h.) outputting a multimedia summary (120) from the key elements
remaining from said step (g).

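By way of illustration only, the three successive filtering steps (e), (f) and (g) of Claim 1 can be sketched as follows. The `KeyElement` fields `topic` and `size_kb`, the representation of a user profile as a set of topics, and the greedy size budget used for the device constraint are assumptions made for this sketch and are not recited in the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyElement:
    modality: str      # "video", "audio" or "text"
    topic: str         # hypothetical label matched against the user profile
    importance: float  # importance value from step (d)
    size_kb: int       # hypothetical payload size used for the device constraint


def summarize(elements, thresholds, profile_topics, allowed_modalities, budget_kb):
    """Apply the filters of steps (e)-(g) in order and return the survivors."""
    # Step (e): exclude elements below the per-modality importance threshold.
    kept = [e for e in elements if e.importance >= thresholds[e.modality]]
    # Step (f): keep only elements matching the user profile (here: a topic set).
    kept = [e for e in kept if e.topic in profile_topics]
    # Step (g): keep modalities the device can render, then fill a size
    # budget greedily, most important element first.
    kept = [e for e in kept if e.modality in allowed_modalities]
    summary, used = [], 0
    for e in sorted(kept, key=lambda e: e.importance, reverse=True):
        if used + e.size_kb <= budget_kb:
            summary.append(e)
            used += e.size_kb
    return summary


elements = [
    KeyElement("video", "sports", 0.9, 400),
    KeyElement("video", "weather", 0.2, 300),
    KeyElement("audio", "sports", 0.7, 80),
    KeyElement("text", "finance", 0.8, 2),
    KeyElement("text", "sports", 0.6, 1),
]
summary = summarize(
    elements,
    thresholds={"video": 0.5, "audio": 0.5, "text": 0.5},
    profile_topics={"sports"},
    allowed_modalities={"audio", "text"},
    budget_kb=100,
)
```

In this sketch each filter only ever narrows the set produced by the previous one, which mirrors the ordering of steps (e) through (g).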
2. The method of Claim 1, wherein said at least one multimedia stream (101, 102) is 
one of an analog and digital multimedia stream. 

3. The method of Claim 1, wherein the step of dividing the at least one multimedia 
stream (101, 102) into a video sub-stream (303) further comprises the step of identifying 
and grouping said at least one multimedia stream (101, 102) into a plurality of news 
stories (330) where each identified news story (330) is comprised of an anchor portion
(311, 312) and a reportage (321, 322) portion.

4. The method of Claim 1, wherein the step of dividing the at least one multimedia
stream (101, 102) into an audio sub-stream (305) further comprises dividing said at least 
one multimedia stream (101, 102) into a plurality of equal-sized frames (306) of a fixed 
time duration. 

5. The method of Claim 1, wherein the step of dividing the at least one multimedia 
stream (101, 102) into a text sub-stream (307) further comprises dividing said at least one 
multimedia stream (101, 102) into a plurality of frames (308) wherein each frame of said 
plurality of frames is defined on a word boundary. 

6. The method of Claim 1, wherein the act of identifying video, audio and text key
elements from said video (303), audio (305) and text (307) sub-streams further comprises
the acts of:

1.) identifying low (510), mid (710) and high level (910) features from the
plurality of frames which comprise said video (303), audio (305) and text (307)
sub-streams;

2.) determining an importance value for each of said extracted low (510),
mid (710) and high level (910) features from said identifying act;

3.) computing a frame importance value for each of said plurality of
frames which comprise said video (303), audio (305) and text (307) sub-streams as a
function of the feature importance values determined at said
determining act;

4.) combining the frames into segments in each of said video (303), audio
(305) and text (307) sub-streams;

5.) computing an importance value per segment for each segment from
said combining act;

6.) ranking the segments based on said computed importance value at said
computing step; and

7.) identifying key elements based on said ranked segments.

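By way of illustration only, acts (3) through (7) of Claim 6 can be sketched for a single sub-stream as follows. The sketch assumes the weighted-sum form of frame importance (the deterministic option of Claim 10), the per-segment average (the option of Claim 14), segment labels supplied in advance in place of act (4), and a hypothetical `top_k` cut-off for act (7); none of these choices is mandated by Claim 6 itself.

```python
def frame_importance(features, weights):
    """Act (3): combine per-feature importance values into one frame value."""
    return sum(weights[name] * value for name, value in features.items())


def key_segments(frames, segment_ids, weights, top_k):
    """Acts (3)-(7) for one sub-stream.

    frames      -- list of {feature name: feature importance value} dicts
    segment_ids -- one segment label per frame, standing in for act (4)
    """
    scores = [frame_importance(f, weights) for f in frames]
    # Act (4): group frame scores by segment label.
    segments = {}
    for seg, score in zip(segment_ids, scores):
        segments.setdefault(seg, []).append(score)
    # Act (5): one importance value per segment (here: the mean frame score).
    seg_importance = {seg: sum(v) / len(v) for seg, v in segments.items()}
    # Act (6): rank segments from most to least important.
    ranked = sorted(seg_importance, key=seg_importance.get, reverse=True)
    # Act (7): take the top-ranked segments as key elements.
    return ranked[:top_k], seg_importance


weights = {"low": 0.2, "mid": 0.3, "high": 0.5}
frames = [
    {"low": 0.1, "mid": 0.2, "high": 0.0},
    {"low": 0.3, "mid": 0.2, "high": 0.1},
    {"low": 0.9, "mid": 0.8, "high": 1.0},
    {"low": 0.7, "mid": 0.6, "high": 0.9},
    {"low": 0.4, "mid": 0.5, "high": 0.3},
]
segment_ids = ["s0", "s0", "s1", "s1", "s2"]
keys, seg_importance = key_segments(frames, segment_ids, weights, top_k=2)
```

With these illustrative values the middle segment `s1` ranks first, followed by `s2`.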


7. The method of Claim 6, wherein said act (3) of computing a frame importance
value for each of said extracted low (510), mid (710) and high level (910) features further 
comprises computing said importance value by one of deterministic, statistical and 
conditional probability means. 

8. The method of Claim 7, wherein said probabilistic means comprises computing 
said frame importance value as one of a Gaussian, Poisson, Rayleigh and Bernoulli 
distribution. 

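9. The method of Claim 8, wherein said Gaussian distribution for computing said
frame importance value is computed as:

\[ \text{Frame Importance} = \frac{1}{\sqrt{2\pi}\,\theta_2} \exp\left( -\frac{(\theta - \theta_1)^2}{2\theta_2^2} \right) \]

where: $\theta$ is any of the features;
$\theta_1$ is the average of the feature value; and
$\theta_2$ is the expected deviation.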

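By way of illustration only, a Gaussian frame importance of the kind recited in Claim 9 can be sketched as follows, assuming the usual Gaussian density in which the average of the feature value acts as the mean and the expected deviation acts as the standard deviation. The function and argument names are hypothetical.

```python
import math


def gaussian_importance(theta, theta1, theta2):
    """Gaussian density of feature value theta, mean theta1, deviation theta2."""
    if theta2 <= 0:
        raise ValueError("theta2 (expected deviation) must be positive")
    z = (theta - theta1) / theta2
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * theta2)


# A feature value at the average scores highest; values further from the
# average score lower, symmetrically on either side.
peak = gaussian_importance(0.5, 0.5, 0.1)
off_peak = gaussian_importance(0.8, 0.5, 0.1)
```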
10. The method of Claim 7, wherein said deterministic means comprises computing
said frame importance value as:

\[ \text{Frame Importance} = \sum_j w_j f_j \]

where: $f_j$ represent low, mid-level and high-level features; and
$w_j$ represent weighting factors for weighting said features.

11. The method of Claim 6, wherein said step (4) of combining the frames into video
segments further comprises combining said frames by one of family histogram
computation means and shot change detection means.

12. The method of Claim 6, wherein said step (4) of combining the frames into audio
segments further comprises the steps of:

categorizing each frame from said audio sub-stream (305) as one of a speech
frame, a music frame, a silence frame, a noise frame, a speech + speech frame, a speech + 
noise frame and a speech + music frame; and 

grouping consecutive frames having the same categorization. 

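By way of illustration only, the grouping step of Claim 12 can be sketched as follows, assuming each audio frame has already been assigned one of the recited categories by some classifier that is outside the scope of the sketch. The tuple layout of the returned segments is an assumption.

```python
from itertools import groupby


def group_audio_frames(labels):
    """Merge runs of consecutive frames with the same category into segments.

    Returns (category, first_frame_index, frame_count) tuples.
    """
    segments, start = [], 0
    for category, run in groupby(labels):
        count = len(list(run))
        segments.append((category, start, count))
        start += count
    return segments


labels = ["speech", "speech", "music", "music", "music", "silence", "speech"]
segments = group_audio_frames(labels)
```

Only adjacent frames are merged, so the two speech runs in this example remain separate segments.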

13. The method of Claim 6, wherein said step (4) of combining the frames into text
segments further comprises combining said frames based on punctuation included in said 
text sub-stream (307). 

14. The method of Claim 6, wherein said step (5) of computing an importance value
per segment further comprises averaging the frame importance values for those frames 
which comprise said segment. 

15. The method of Claim 6, wherein said step (5) of computing an importance value 
per segment further comprises using the highest frame importance value in said segment.

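By way of illustration only, the two alternatives of Claims 14 and 15 for the per-segment importance value can be sketched side by side. The `mode` argument is a hypothetical switch introduced for this sketch.

```python
def segment_importance(frame_values, mode="mean"):
    """Per-segment importance from the frame importance values of one segment."""
    if not frame_values:
        raise ValueError("a segment must contain at least one frame")
    if mode == "mean":   # Claim 14: average of the frame importance values
        return sum(frame_values) / len(frame_values)
    if mode == "max":    # Claim 15: highest frame importance value
        return max(frame_values)
    raise ValueError(f"unknown mode: {mode!r}")


frame_values = [0.2, 0.4, 0.9]
mean_value = segment_importance(frame_values, "mean")
max_value = segment_importance(frame_values, "max")
```

The average rewards segments that are uniformly important, whereas the maximum rewards segments containing a single highly important frame.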
16. The method of Claim 6, wherein said step (7) of identifying key elements based 
on said rankings further comprises identifying key elements whose segment ranking 
exceeds a predetermined segment ranking threshold. 

17. The method of Claim 6, wherein said step (7) of identifying key elements based
on said rankings further comprises identifying key elements whose segment ranking both
exceeds a predetermined segment ranking threshold and constitutes a local maximum.

18. The method of Claim 6, wherein said step (7) of identifying key elements based
on said rankings further comprises identifying key elements whose segment ranking
constitutes a local maximum.

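By way of illustration only, the three selection rules of Claims 16 to 18 can be sketched with one function. The sketch assumes that segment ranking values are held in temporal order, that a local maximum is a value strictly greater than both neighbours, and that a missing neighbour at either end of the sequence is ignored; these conventions are not specified in the claims.

```python
def select_key_elements(values, threshold=None, local_maxima=False):
    """Return indices of the segments chosen as key elements.

    values       -- per-segment ranking values, in temporal order
    threshold    -- if given, a segment's value must exceed it
    local_maxima -- if True, the value must exceed both neighbours
    """
    picked = []
    for i, v in enumerate(values):
        if threshold is not None and not v > threshold:
            continue
        if local_maxima:
            left = values[i - 1] if i > 0 else float("-inf")
            right = values[i + 1] if i + 1 < len(values) else float("-inf")
            if not (v > left and v > right):
                continue
        picked.append(i)
    return picked


values = [0.2, 0.7, 0.4, 0.6, 0.9, 0.8, 0.3]
by_threshold = select_key_elements(values, threshold=0.5)               # Claim 16
by_both = select_key_elements(values, threshold=0.8, local_maxima=True)  # Claim 17
by_maxima = select_key_elements(values, local_maxima=True)               # Claim 18
```

Combining both conditions, as in Claim 17, yields the most selective rule of the three.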
19. A system (100) for summarizing at least one multimedia stream (101, 102),
comprising: a modality recognition and division (MRAD) module (103) comprising a
story segment identifier (SSI) module (103a), an audio identifier (AI) module (103b) and
a text identifier (TI) module (103c), the MRAD module (103) communicatively coupled
to a first external source (110) for receiving said at least one multimedia stream (101,
102), the MRAD module (103) communicatively coupled to a second external source
(112) for receiving said at least one multimedia stream (101, 102), the MRAD module
(103) dividing said at least one multimedia stream (101, 102) into a video (303), an
audio (305) and a text (307) sub-stream and outputting said video (303), audio (305) and
text (307) sub-streams to a KEI module (105), the KEI module (105) comprising a
feature extraction (FE) module (107) and an importance value (IV) module (109) for
identifying key elements from within said video (303), audio (305) and text (307)
sub-streams and assigning importance values thereto, the KEI module (105) communicatively
coupled to a key element filter (KEF) (111) for receiving the identified key elements and
filtering said key elements that exceed a pre-determined threshold criteria, the KEF
module (111) communicatively coupled to a user profile filter (UPF) (113) for receiving
filtered key elements and further filtering said filtered key elements in accordance with a
user profile, the UPF module (113) communicatively coupled to a network and device
constraint (NADC) module (115), said NADC module (115) receiving said further
filtered key elements and further filtering said further filtered key elements in accordance
with network and/or user device constraints, the NADC module (115) outputting a
multimedia summary (120) of said at least one multimedia stream (101, 102).

20. The system of Claim 19, further comprising a user preference database (117)
communicatively coupled to said UPF module (113) for storing user profiles.

21. The system of Claim 19, wherein the first external source (110) is a broadcast
channel selector.

22. The system of Claim 19, wherein the first external source (110) is a video
streaming source. 

23. The system of Claim 19, wherein said at least one multimedia stream (101, 102) is
one of an analog and digital multimedia stream.

24. The system of Claim 19, wherein the NADC module (115) is communicatively
connected to an external network (122) coupled to a user device (124).

25. The system of Claim 19, wherein the network (122) is the Internet.

26. An article of manufacture for summarizing at least one multimedia stream (101,
102), comprising: a computer readable medium having computer readable program code means
embodied thereon, said computer readable program code means comprising:

an act of one of receiving and retrieving said at least one multimedia
stream (101, 102) comprising video, audio and text information;

an act of dividing said at least one multimedia stream (101, 102) into a
video sub-stream (303), an audio sub-stream (305) and a text sub-stream (307);

an act of identifying video, audio and text key elements from said video
(303), audio (305) and text (307) sub-streams, respectively;

an act of computing an importance value for the identified video, audio
and text key elements identified at said identification act;

an act of first filtering the identified video, audio and text key elements to
exclude those key elements whose associated importance value is less than a pre-defined
video, audio and text importance threshold, respectively;

an act of second filtering the remaining key elements from said first
filtering act in accordance with a user profile;

an act of third filtering the remaining key elements from said second
filtering act in accordance with network and user device constraints; and

an act of outputting a multimedia summary (120) from the key elements
remaining from said third filtering act.



27. The article of manufacture of Claim 26, wherein the act of identifying
video, audio and text key elements from said video (303), audio (305) and text (307)
sub-streams, respectively, further comprises:

an act of identifying low (510), mid (710) and high level (910) features
from the plurality of frames which comprise said video (303), audio (305) and text (307)
sub-streams;

an act of determining an importance value for each of said extracted low
(510), mid (710) and high level (910) features from said identifying act;

an act of computing a frame importance value for each of said plurality of
frames which comprise said video (303), audio (305) and text (307) sub-streams as a
function of the feature importance values determined at said
determining act;

an act of combining the frames into segments in each of said video (303), 
audio (305) and text (307) sub-streams; 

an act of computing an importance value per segment for each segment 
from said combining act; 

an act of ranking the segments based on said computed importance value 
at said computing act; and 

an act of identifying key elements based on said ranked segments. 



